Likelihood normalization using an ergodic HMM for continuous speech recognition
Author
Abstract
In recent speech recognition technology, the score of a hypothesis is often defined on the basis of HMM likelihood. As is well known, however, using the likelihood directly as a scoring function causes difficult problems, especially when the length of a speech segment varies with the hypothesis, as in word spotting, so some kind of normalization is indispensable. In this paper, a new method of likelihood normalization using an ergodic HMM is presented, and its performance is compared with that of conventional methods. The comparison is made from three points of view: recognition rate, word-end detection power, and mean hypothesis length. It is concluded that the proposed method gives the best overall performance.
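The abstract does not give the normalization formula, so the following is only a minimal sketch of the general idea, assuming the normalized score is a log-likelihood ratio: the hypothesis HMM's log-likelihood minus the log-likelihood of a fully connected (ergodic) background HMM over the same segment. All names (`forward_loglik`, `normalized_score`, the toy `WORD` and `ERGODIC` models) are hypothetical and use discrete observations for brevity; they are not taken from the paper.

```python
import math

def logsumexp(xs):
    """Numerically stable log(sum(exp(x))) that tolerates -inf entries."""
    m = max(xs)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(x - m) for x in xs))

def to_log(p):
    """Log of a probability, mapping 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf

def forward_loglik(obs, model):
    """Log-likelihood of a discrete observation sequence (forward algorithm)."""
    pi, A, B = model
    n = len(pi)
    alpha = [to_log(pi[i]) + to_log(B[i][obs[0]]) for i in range(n)]
    for o in obs[1:]:
        alpha = [
            logsumexp([alpha[i] + to_log(A[i][j]) for i in range(n)])
            + to_log(B[j][o])
            for j in range(n)
        ]
    return logsumexp(alpha)

# Toy left-to-right "word" model over 3 symbols (illustrative values only).
WORD = (
    [1.0, 0.0],
    [[0.6, 0.4], [0.0, 1.0]],
    [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]],
)

# Toy ergodic background model: every state can reach every other state.
# Assumption: it plays the role of a length-dependent reference score.
ERGODIC = (
    [0.5, 0.5],
    [[0.5, 0.5], [0.5, 0.5]],
    [[1 / 3, 1 / 3, 1 / 3], [1 / 3, 1 / 3, 1 / 3]],
)

def normalized_score(obs, word=WORD, background=ERGODIC):
    """Assumed form: log-likelihood ratio against the ergodic model."""
    return forward_loglik(obs, word) - forward_loglik(obs, background)

short_match = [0, 1]
long_match = [0, 0, 0, 1, 1, 1]
mismatch = [2, 2, 2, 2]

for seg in (short_match, long_match, mismatch):
    print(len(seg), forward_loglik(seg, WORD), normalized_score(seg))
```

In this toy setup the raw log-likelihood of the longer matching segment is lower than that of the shorter one simply because it has more frames, whereas the ratio against the background model stays positive for both matching segments and negative for the mismatching one. That is the kind of length dependence the abstract says makes normalization indispensable.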
Similar resources
Voice quality normalization in an utterance for robust ASR
In this paper, we propose a novel method of normalizing the voice quality in an utterance for both clean speech and speech contaminated by noise. The normalization method is applied to the N-best hypotheses from an HMM-based classifier, then an SM (Sub-space Method)-based verifier tests the hypotheses after normalizing the monophone scores together with the HMM-based likelihood score. The HMM-SM...
Performance Evaluation of Statistical Approaches for Text Independent Speaker Recognition Using Source Feature
This paper evaluates the performance of statistical approaches for a text-independent speaker recognition system using source features. Linear prediction (LP) residual is used as a representation of excitation information in speech. The speaker-specific information in the excitation of voiced speech is captured using statistical approaches such as Gaussian Mixture Models (GMMs) and Hid...
Irrelevant variability normalization based HMM training using VTS approximation of an explicit model of environmental distortions
In a traditional HMM compensation approach to robust speech recognition that uses Vector Taylor Series (VTS) approximation of an explicit model of environmental distortions, the set of generic HMMs is typically trained from “clean” speech only. In this paper, we present a maximum likelihood approach to training generic HMMs from both “clean” and “corrupted” speech based on the concept of irrel...
Glottal Excitation Feature based Gender Identification System using Ergodic HMM
In this paper, through different experimental studies it is demonstrated that the time varying glottal excitation component of speech can be exploited for text independent gender recognition studies. Linear prediction (LP) residual is used as a representation of excitation information in speech. The gender-specific information in the excitation of voiced speech is captured using the Hidden Mark...
Linear Transforms in Automatic Speech Recognition: Estimation Procedures and Integration of Diverse Acoustic Data
Linear transforms have been used extensively for both training and adaptation of Hidden Markov Model (HMM) based automatic speech recognition (ASR) systems. Two important applications of linear transforms in acoustic modeling are the decorrelation of the feature vector and the constrained adaptation of the acoustic models to the speaker, the channel, and the task. Our focus in the first part of...
Journal title:
Volume, issue:
Pages: -
Publication date: 1996